A Geometric Perspective on Speech Sounds
Author
Abstract
In order to effectively approach high dimensional pattern recognition problems, one seeks to understand and exploit any inherent low dimensional structure. Recently, a number of manifold learning algorithms have been motivated by a geometric point of view that models high dimensional data as lying near a low dimensional submanifold of the original space. Our paper has two main goals. (i) To investigate this manifold assumption for natural speech data. It seems intuitive that a human speech producing apparatus with few degrees of freedom would not produce sounds that fill up the acoustic space. We formalize this intuition by considering a concatenated acoustic tube model of the vocal tract and showing that the sounds generated by such a system lie on a low dimensional curved submanifold of the ambient acoustic space. To the extent that this model captures the essence of human speech production, the manifold assumption is true of natural speech data. (ii) To explore the implications of this geometric point of view for human speech. We show that the manifold structure of speech sounds may be exploited for dimensionality reduction, semi-supervised learning, and speech representation, with sometimes striking performance improvements on simulated and real speech data. The non-linear geometry of speech sounds suggests new interpretations of phenomena such as the perceptual magnet effect or quantal theory.
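The manifold claim in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's construction: it assumes a lossless three-section tube with the first area fixed and two free areas, converts the junction reflection coefficients to an all-pole filter with the standard step-up recursion, and compares the local and global linear dimension of the resulting 64-point log spectra. The function names, the area ranges, the sign convention for the reflection coefficients, and the tolerance `rel_tol` are all illustrative assumptions.

```python
import numpy as np

def tube_log_spectrum(areas, n_freq=64):
    """Log-magnitude spectrum of the all-pole filter of a lossless tube chain."""
    areas = np.asarray(areas, dtype=float)
    # Reflection coefficient at each junction (sign convention is an assumption).
    k = (areas[1:] - areas[:-1]) / (areas[1:] + areas[:-1])
    # Step-up recursion: reflection coefficients -> denominator polynomial A(z).
    a = np.array([1.0])
    for km in k:
        a_ext = np.concatenate([a, [0.0]])
        a = a_ext + km * a_ext[::-1]
    w = np.linspace(0.0, np.pi, n_freq)
    # Evaluate A(e^{jw}) = sum_i a_i e^{-jwi} on the frequency grid.
    A = np.exp(-1j * np.outer(w, np.arange(len(a)))) @ a
    return -np.log(np.abs(A))

def spectra(free_areas):
    """Map each pair of free areas to a point in 64-dimensional spectral space."""
    return np.array([tube_log_spectrum([1.0, a2, a3]) for a2, a3 in free_areas])

def effective_dim(points, rel_tol=1e-3):
    """Count singular values of the centered data above rel_tol * largest."""
    centered = points - points.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
# Tiny neighbourhood around one hypothetical tube shape: probes the tangent space.
local = np.array([2.0, 0.5]) + 1e-6 * rng.uniform(-1, 1, size=(200, 2))
# Wide range of tube shapes: probes the global, curved shape of the set.
wide = rng.uniform(0.3, 3.0, size=(200, 2))

local_dim = effective_dim(spectra(local))
global_dim = effective_dim(spectra(wide))
print("local linear dimension:", local_dim)
print("global linear dimension:", global_dim)
```

In this toy setting the local dimension matches the two degrees of freedom of the tube, while a single linear subspace fitted to the whole set needs more directions. That gap is what "low dimensional curved submanifold" refers to, and it is one reason linear methods such as PCA can overestimate the intrinsic dimension of such data.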
Similar resources
Semi-supervised learning of speech sounds
Recently, there has been much interest in both semi-supervised and manifold learning algorithms, though their applicability has not been explored for all domains. This paper has two goals: (i) to demonstrate that semi-supervised approaches based solely on clustering are insufficient for phoneme classification and (ii) to present a new manifold-based semi-supervised algorithm to remedy this shortcomi...
Synthesis: One Vocal Tract Target Configuration Has More than One Sound
Articulatory speech synthesis can be used for speech production research to gain insight into articulation patterns and their acoustic counterparts, the speech sounds. It can be used e.g. to conduct perception experiments that study the relationship between articulation and fine phonetic detail in the acoustic domain. In a case study, we focus on articulatory details in German vowels. Results i...
Comparing the Effect of Production-Training-Based Therapy and Non-Speech Oral Motor Training on the Speech of 4-6-Year-Old Children with Phonological Disorder
Objective: Speech sound disorders are among the most common speech disorders in children. Speech-language pathologists have long used non-speech oral motor exercises as a facilitative activity in therapy sessions for a wide variety of speech disorders. However, there are few controlled empirical data with which to evaluate their effectiveness. This study aimed at comparing the effects of therapeu...
Usability of Non-speech Sounds in User Interfaces
We review the literature on the integration of non-speech sounds into visual interfaces and applications from a usability perspective and subsequently recommend which auditory feedback types serve to enhance human interaction with computers by conveying useful and comprehensible information. We present an overview of varied tasks, functions and environments with a view to establishing the best ...
Speech development and auditory performance in children after cochlear implantation
Abstract Background: The aim of this study was to determine the auditory performance of congenitally deaf children and the effect of cochlear implantation (CI) on speech intelligibility. Methods: A prospective study was undertaken on 47 children in a pediatric tertiary referral center for CI. All children were deaf prelingually and were younger than 8 years of age. They were followed up until 5...